Abstract


This report presents an overview and performance summary of an Automated Target Recognition (ATR) algorithm based on spot Synthetic Aperture Radar (SAR) imagery. Feature extraction and classification are key steps in the ATR process. In this algorithm, two-dimensional wavelet decomposition was applied to SAR targets to extract features. An appropriate mother wavelet was selected by testing various wavelets and choosing the one that produced the smallest variation between features for the same target type and the largest variation between features for different target types. After extensive testing, the Reverse Biorthogonal wavelet was selected as the best mother wavelet for this application. Second-level approximation coefficients were used as features and were fed into a Multi-Layer Perceptron (MLP) neural network (NN) for classification. The MLP NN was trained using a supervised method, the standard delta rule. The classification results are shown using Receiver Operating Characteristic (ROC) curves and confusion matrices. The analysis shows that Reverse Biorthogonal wavelet features perform as well as two-dimensional Fast Fourier Transform features on the MSTAR (Moving and Stationary Target Acquisition and Recognition) dataset. Results also show that including confusers (objects that the ATR algorithm is not intended to classify) in the training dataset reduces false alarms, because the classifier learns to reject confusers during the training process.


Executive  summary


Synthetic Aperture Radar (SAR) is a very useful imaging sensor for defence applications because it operates at all times of day and under all weather conditions. However, because the image produced is based on the backscatter of a high frequency incident beam, targets in the imagery are often difficult to identify. The volume of radar imagery collected increases every year as new systems with radar imaging capability are deployed (CP-140, Predator, etc.), and there is no time to go through the collected data manually. The intention of the Automated Target Recognition (ATR) method is to draw the image analyst's attention to any potential targets that may exist in an image and to pass on all the known information about them, such as target type and size. The ATR process can be broken into many parts, of which feature extraction and classification are two of the most fundamental. Of the many feature extraction and classification algorithms available, two-dimensional wavelet feature extraction and Multi-Layer Perceptron (MLP) Neural Network (NN) classification have been used successfully in many applications. These techniques can be extended for use with SAR imagery.

For the algorithm described in this document, the two-dimensional wavelet algorithm is applied to SAR images to extract significant signatures (features) from each target. The MLP NN is then used to identify the type of a given target based on these signatures. The optimal wavelet for feature extraction was determined by comparing the feature variability for similar and different target types produced by various wavelets. The Reverse Biorthogonal wavelet was selected as the best since it produced the lowest feature variability for similar target types and the highest variability for different target types. The ATR algorithm was implemented in Matlab using the two-dimensional wavelet transform defined in the wavelet toolbox, and the MLP neural network was developed by DRDC Ottawa. To evaluate the ATR algorithm, Receiver Operating Characteristic (ROC) curves and confusion matrices were used. The ROC curve shows the relationship between the percentage of correct detections and the percentage of false alarms. The confusion matrix shows the number of targets that were correctly recognized, misclassified and rejected.

The results show that including non-target vehicles ("confusers") in the training set decreases the percentage of false alarms by more than 79%. This shows that including expected confusers in the training dataset will reduce false alarms. The percentage of correct classification of detected targets was 86.4%. Therefore, the wavelet feature extraction algorithm has some ability to separate different target types, and it can be used in an ATR system to improve the recognition rate.

Future work includes combining these algorithms with other ATR algorithms to improve the classification performance. In addition to incorporating more than one feature extraction algorithm, the dominant features (the features that best represent the target type) should be selected to reduce the dimensionality of the input to the classifier.
1.0  Introduction


A  Synthetic  Aperture  Radar  (SAR)  system  transmits  microwaves  and  records  the  reflected 
signals  from  the  scene  being  imaged.  The  SAR  imaging  technique  is  significantly  different 
from  optical  techniques  where  the  image  acquired  is  based  on  the  energy  emitted  by  objects  in 
the  sensor’s  field  of  view  (FOV).  A  complete  description  of  SAR  imaging  can  be  found  in 
[1,2,3]. The brightness of a pixel in a SAR image depends on the strength of the reflected signal:
high  strength  signals  appear  brighter  than  low  strength  signals.  The  strength  of  the  reflected 
signal  depends  on  many  factors  [4, 5, 6, 7]  such  as  size,  topography,  and  water  content  of 
objects,  as  well  as  radar  wavelength,  signal  polarization,  and  incident  angle.  The  reflected 
signal’s  properties  help  Automatic  Target  Recognition  (ATR)  systems  in  identifying  unknown 
targets.  In  recent  years,  the  research  community  has  increased  the  use  of  SAR  imagery  in 
ATR  development  [8,9,10].  SAR  ATR  systems,  however,  are  still  in  the  developmental  stage 
and  it  will  be  some  time  until  they  are  fully  operational. 

A SAR-based ATR system requires a fast and effective discriminator to detect a target and recognize the type of target from the radar return [11]. This study concentrates on the target recognition process. The target recognition process, as illustrated in Figure 1, can be broken into meaningful processing steps: preprocessing, feature extraction and classification.


Figure  1:  Block  diagram  of  target  recognition  processing  steps. 


Pattern recognition systems have been developed and applied to many applications, including optical character recognition [12-16] and data mining [17-23]. Methods such as Neural Networks (NN) [24-28] and wavelet transforms [29,30,31] have been applied in SAR applications. Both techniques, NN and wavelet transforms, were applied to SAR land area classification [32]. In this study, the two-dimensional discrete wavelet transform (DWT) algorithm is used to extract features from SAR imagery and the Multi-Layer Perceptron (MLP) NN algorithm is used to classify targets based on the extracted features. Seven different mother wavelets were applied to extract wavelet features, and the results were compared to select the best mother wavelet for this application. The selected mother wavelet was then used to extract features from SAR targets, and the MLP NN was used to classify the targets using these features. Preprocessing, feature extraction, and classification are discussed in the next three sections, including the DWT feature extraction and MLP classification methods.
2.0  Detected  target


A  SAR  scene  may  contain  many  man-made  objects,  as  shown  in  Figure  2a.  These  objects 
can  be  extracted  from  the  scene  either  by  applying  automated  detection  algorithm(s)  or  by 
manual  detection.  The  selected  object  is  called  the  detected  target  (target  chip)  because  this 
object  is  the  focus  of  attention  of  the  image  analyst  and/or  the  detection  algorithm(s)  for 
further  consideration.  The  target  chip  is  shown  in  Figure  2b. 


An ATR system receives the detected target and then processes it to determine its identity. The output of the system is a list of known target identifications and the confidence associated with these identifications. The testing and training target chips used in this investigation are 128 pixels by 128 pixels. These target chips are high quality SAR imagery of military ground vehicles, and are known as the MSTAR (Moving and Stationary Target Acquisition and Recognition) public data set.


3.0  Preprocessing 


Some  feature  extraction  algorithms  and  classification  algorithms  are  sensitive  to  location  shift, 
rotation,  and  intensity.  Reducing  the  sensitivity  to  these  geometric  and  radiometric  variations 
can enhance the accuracy of an ATR system. The following is a description of the preprocessing techniques used in this study. The same methods were used by English [24].


For  this  study,  only  the  value  of  magnitude  was  considered  and  the  phase  information  ignored. 
All  the  target  chips  (training  and  testing)  were  converted  from  slant-range  to  ground-range 
using  the  following: 


$$ G = \frac{S}{\cos(\theta)} \tag{1} $$

where $S$ is the slant range, $G$ is the ground range, and $\theta$ is the depression angle. Figure 3 shows the relationship between these three parameters.
Figure 3: Representation of slant range and ground range ($\theta$ is the depression angle).


In the MSTAR dataset the target contained within each chip is oriented independently of the other chips. To bring the targets into a standardized target orientation, each target was rotated to a vertical orientation with the front end of the target facing north. For this rotation, the orientation angle is found using ground truth information. Figure 4a and Figure 4b show the target position before and after rotation, respectively.


Figure 4: Standardization of target orientation to the north-facing position: a) target BTR-70 before correcting its orientation and b) after rotating the BTR-70 to the north-facing orientation.


After rotation to the standardized orientation, the highest energy reflecting point of the target was located in the target chip. A median filter was applied to isolate the target region in the target chip, and a search was subsequently used to locate the highest energy return point in that region. The energy at this point represents the highest energy reflected towards the radar sensor for the particular sensor depression angle and target orientation. This point of highest energy was then used as the centre point for a new chip of 64 pixels by 64 pixels. The reduced size of the new target chip was more than enough to cover any target in the MSTAR three-target problem. Figure 5 shows the new target chip cropped from the target chip shown in Figure 4b.
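
The re-centring step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report does not give the median filter size, how ties are broken, or what happens when the peak lies near the chip border, so the 3 by 3 window, the first-found tie rule, and the inward shift of the crop window are all choices made for this sketch. The function names are hypothetical, and the peak of the median-filtered chip is used as a simple stand-in for "the highest energy return point in the target region".

```python
from statistics import median


def median_filter3(img):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [img[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = median(window)
    return out


def peak_location(img):
    """(row, col) of the largest value; first one in row-major order on ties."""
    best_r, best_c = 0, 0
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value > img[best_r][best_c]:
                best_r, best_c = r, c
    return best_r, best_c


def crop_centered(img, center, size=64):
    """size x size window around center, shifted inward if it would leave the chip."""
    rows, cols = len(img), len(img[0])
    top = min(max(center[0] - size // 2, 0), rows - size)
    left = min(max(center[1] - size // 2, 0), cols - size)
    return [row[left:left + size] for row in img[top:top + size]]


def recenter_chip(chip, size=64):
    """Smooth the chip, find the strongest return, crop the original around it."""
    peak = peak_location(median_filter3(chip))
    return crop_centered(chip, peak, size)


# Synthetic 128 x 128 chip: a small bright "target" plus one isolated speckle spike.
chip = [[0.0] * 128 for _ in range(128)]
chip[5][5] = 100.0                      # isolated spike, suppressed by the median filter
for r in range(40, 43):
    for c in range(90, 93):
        chip[r][c] = 10.0               # 3 x 3 target blob
new_chip = recenter_chip(chip)          # 64 x 64 chip around the blob
```

The median filter is what keeps the isolated spike from being chosen as the centre point: a single bright pixel never dominates a 3 by 3 window, whereas a compact group of bright pixels does.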
Figure  5:  Size  of  reduced  target  chip  is  64  by  64  pixels.


The final preprocessing step was to normalize the target chips. Normalization alters the pixel values such that the mean intensity is zero and the standard deviation is one for each chip. This was done by subtracting the mean intensity from each pixel intensity and dividing the result by the standard deviation of the target chip, as given in the following equation:

$$ X' = \frac{X - \bar{X}}{\sigma_X}, \qquad \bar{X} = \frac{1}{N M} \sum_{n=1}^{N} \sum_{m=1}^{M} X(n,m) \tag{2} $$

where $X$ is the non-normalized target chip, $X'$ is the normalized target chip, $\bar{X}$ is the mean intensity, $\sigma_X$ is the standard deviation of the target chip $X$, $N$ is the number of pixels in the range direction, and $M$ is the number of pixels in the cross range direction. The normalized target chip was then passed on to the feature extraction algorithms.
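
As a minimal sketch of this normalization, the following Python function rescales a chip to zero mean and unit standard deviation. It assumes the population form of the standard deviation (division by the number of pixels), which the text does not state explicitly; the function name is illustrative only.

```python
import math


def normalize_chip(chip):
    """Return a copy of chip with zero mean and unit standard deviation."""
    pixels = [value for row in chip for value in row]
    count = len(pixels)
    mean = sum(pixels) / count
    # Population standard deviation (an assumption; see the lead-in).
    std = math.sqrt(sum((value - mean) ** 2 for value in pixels) / count)
    if std == 0.0:
        raise ValueError("a constant chip cannot be normalized")
    return [[(value - mean) / std for value in row] for row in chip]


# Tiny 2 x 2 example chip.
normalized = normalize_chip([[1.0, 2.0], [3.0, 4.0]])
```

After this step, chips acquired with different overall brightness become directly comparable, which is the radiometric invariance the preprocessing is aiming for.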


4.0  Feature  Extraction  method 


Feature  extraction  is  one  of  the  important  steps  in  the  ATR  process.  Feature  extraction 
algorithms  extract  unique  information  or  a  signature  from  each  target.  A  very  good  feature 
extraction  algorithm  gives  smaller  variation  between  the  same  type  of  targets  and  larger 
variation  between  different  types  of  targets.  Selection  of  a  good  feature  extraction  algorithm 
is  important;  otherwise,  it  will  be  difficult  to  differentiate  between  targets  of  different  types 
and misclassification will occur. The two-dimensional (2D) Fast Fourier Transform (FFT)
feature  extraction  method,  which  was  used  in  [25],  is  briefly  described  in  Section  4.1.  The 
Wavelet  Transform  (WT)  feature  extraction  method  was  used  for  this  study  and  a  detailed 
description  is  given  in  Section  4.2. 
4.1  2D  FFT  feature  extraction  method


In [25], the 2D FFT was used to extract Fourier coefficient (FC) features. A 64 pixel by 64 pixel image has 4096 FCs, half of which are redundant, so only 2048 FCs actually represent the image. Using all of these FC features would slow down the classification and training processes. The number of FCs was therefore reduced to 16 for each target type, with different harmonics used for each target type, and the best 16 harmonics for each target type were then combined. The selection procedure can be found in [25]. In English's work [24], 256 FCs were selected for each target type, but the 256 harmonics of each target type were not combined. For example, the selected 256 FCs for the T72 target type are displayed in Figure 6.
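
The redundancy mentioned above follows from the conjugate symmetry of the Fourier transform of a real-valued image: the coefficient at frequency (-k, -l) is the complex conjugate of the coefficient at (k, l), so roughly half of the coefficients carry no new information. The short Python sketch below checks this on a small 4 by 4 image with a direct (slow) DFT; the image values and function names are arbitrary illustrations, not data from the study.

```python
import cmath


def dft2(img):
    """Direct 2D discrete Fourier transform of a small real-valued image."""
    n_rows, n_cols = len(img), len(img[0])
    out = [[0j] * n_cols for _ in range(n_rows)]
    for k_row in range(n_rows):
        for k_col in range(n_cols):
            out[k_row][k_col] = sum(
                img[r][c] * cmath.exp(
                    -2j * cmath.pi * (k_row * r / n_rows + k_col * c / n_cols))
                for r in range(n_rows) for c in range(n_cols))
    return out


def max_symmetry_error(spectrum):
    """Largest |F[-k, -l] - conj(F[k, l])| over all coefficients."""
    n_rows, n_cols = len(spectrum), len(spectrum[0])
    return max(
        abs(spectrum[-k_row % n_rows][-k_col % n_cols]
            - spectrum[k_row][k_col].conjugate())
        for k_row in range(n_rows) for k_col in range(n_cols))


image = [[1, 2, 0, 3],
         [4, 0, 1, 2],
         [2, 2, 5, 0],
         [0, 1, 3, 4]]
spectrum = dft2(image)
error = max_symmetry_error(spectrum)   # numerically zero for a real image
```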


Figure 6: Locations of the selected Fourier coefficient features for the T72.

The 2D Fourier analysis gives the frequency response of an image and depends on the periodic components that occur in the image. Fourier analysis offers good frequency resolution but poor space localization. Therefore, if good space resolution is required, another method must be used. One solution is to use all pixels in the target chip as inputs to the classifier. However, for a classifier such as the MLP NN, 4096 inputs (64 x 64) is too computationally intensive. In addition, this approach provides the classifier with both target and non-target information. This is problematic because non-target information can cause false alarms or misclassifications. An alternative approach is the WT, as it provides both spatial and frequency localization [33,34,35].


4.2  Two  dimensional  wavelet  feature  extraction  method 

The WT decomposes the original image into several sub-images of coarser resolution than the original image. At each level of decomposition, four sub-images (LL, LH, HL, HH) are obtained according to the mother wavelet function. LL contains low frequency components in the horizontal and vertical directions. LH contains low frequency components in the horizontal direction and high frequency components in the vertical direction. HL contains high frequency components in the horizontal direction and low frequency components in the vertical direction. HH contains high frequency components in both the horizontal and vertical directions. Each sub-image is reduced to one-fourth the size of the image being decomposed. Fourier analysis expresses the original image as a sum of basis functions, which are sinusoids of different frequencies. Similarly, wavelet analysis expresses the original image as a sum of basis functions, but these basis functions are shifted and scaled versions of the mother wavelet. Here, we consider only the separable 2D discrete wavelet transform because it can be computed using 1D scaling and wavelet functions.

The 2D DWT decomposes an image in terms of wavelet and scaling functions [36]. The following equation shows the 2D wavelet decomposition of an image $X(u_1,u_2)$, $X(u) \in L^2(\mathbb{R}^2)$:

$$ X(u) = \sum_{k \in \mathbb{Z}^2} a_{j_0,k}\,\phi_{j_0,k}(u) + \sum_{b \in B} \sum_{j \geq j_0} \sum_{k \in \mathbb{Z}^2} d^{b}_{j,k}\,\psi^{b}_{j,k}(u) \tag{3} $$

where $b \in B := \{LH, HL, HH\}$, $\phi_{j_0,k}(u)$ is the 2D dilated and translated scaling function, $a_{j_0,k}$ are the scaling or approximation coefficients, $\psi^{b}_{j,k}$ is the 2D translated and dilated wavelet function, $d^{b}_{j,k}$ are the detail or wavelet coefficients, $j$ ($j \geq j_0$) is a scale factor, $k$ (a two-dimensional variable, $k = (k_1, k_2)$) is the shifting factor of the wavelet and scaling functions, and $j_0$ is a fixed scale. These 2D functions can be broken down into the product of 1D functions [36, 37]:


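$$ \phi_{j_0,k}(u) = \phi_{j_0,k_1}(u_1)\,\phi_{j_0,k_2}(u_2) \tag{4} $$

$$ \psi^{LH}_{j,k}(u) = \phi_{j,k_1}(u_1)\,\psi_{j,k_2}(u_2) \tag{5} $$

$$ \psi^{HL}_{j,k}(u) = \psi_{j,k_1}(u_1)\,\phi_{j,k_2}(u_2) \tag{6} $$

$$ \psi^{HH}_{j,k}(u) = \psi_{j,k_1}(u_1)\,\psi_{j,k_2}(u_2) \tag{7} $$

where $\phi_{j_0,k_1}(u_1)$ and $\psi_{j,k_1}(u_1)$ are the 1D column direction scaling and wavelet functions, and $\phi_{j_0,k_2}(u_2)$ and $\psi_{j,k_2}(u_2)$ are the 1D row direction scaling and wavelet functions. These 1D translated and dilated functions are formed from the mother wavelet and scaling functions: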

$$ \phi_{j_0,n}(t) = 2^{j_0/2}\,\phi(2^{j_0} t - n) \tag{8} $$

$$ \psi_{j,n}(t) = 2^{j/2}\,\psi(2^{j} t - n), \qquad n = k_1, k_2 \ \text{and} \ t = u_1, u_2 \tag{9} $$


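where $\phi$ is the mother scaling function and $\psi$ is the mother wavelet function. The approximation coefficients $a_{j_0,k}$ and the detail coefficients $d^{b}_{j,k}$ can be found using the following equations [38] in discrete form:

$$ a_{j_0,k} = \sum_{u_1} \sum_{u_2} X(u_1,u_2)\,\phi_{j_0,k_1}(u_1)\,\phi_{j_0,k_2}(u_2) \;\Rightarrow\; LL \tag{10} $$

$$ d^{LH}_{j,k} = \sum_{u_1} \sum_{u_2} X(u_1,u_2)\,\phi_{j,k_1}(u_1)\,\psi_{j,k_2}(u_2) \;\Rightarrow\; LH \tag{11} $$

$$ d^{HL}_{j,k} = \sum_{u_1} \sum_{u_2} X(u_1,u_2)\,\psi_{j,k_1}(u_1)\,\phi_{j,k_2}(u_2) \;\Rightarrow\; HL \tag{12} $$

$$ d^{HH}_{j,k} = \sum_{u_1} \sum_{u_2} X(u_1,u_2)\,\psi_{j,k_1}(u_1)\,\psi_{j,k_2}(u_2) \;\Rightarrow\; HH \tag{13} $$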

The 2D approximation and detail coefficients can be calculated from the 1D scaling and wavelet functions. This separable 2D discrete wavelet transform can be computed by applying 1D low-pass and high-pass digital filters [36,37,38,39] to an image $X(u_1,u_2)$. The low-pass filter represents the scaling function and the high-pass filter represents the wavelet function [34]. The decomposition using 1D filters is achieved by first filtering the rows of the input image with the low-pass and high-pass filters and decimating the results by 2. The columns of the two decimated images are then filtered with the low-pass and high-pass filters and decimated by 2 again. This first-level decomposition produces the four filtered images discussed above: LL, LH, HL, and HH.

The processes required for two levels of wavelet decomposition are shown in Figure 7. LL and LLLL are the level 1 and level 2 approximation images. LH, HL and HH are the level 1 detail images, and LLLH, LLHL and LLHH are the level 2 detail images. The level 2 approximation image contains low frequency information in both the vertical and horizontal directions; that is, high frequency noise is removed in LLLL. In this study, the level 2 approximation sub-band image (LLLL) is fed into the classifier for identification. The Matlab wavelet toolbox is used to extract the second level approximation image. The sub-band image block diagram and an example illustrating the 2D WT decomposition are shown in Figure 8.
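
The row-then-column filtering and the two-level pyramid can be illustrated with the Haar wavelet, whose filters have only two taps and therefore need no boundary handling. This is a sketch only: the study used the Matlab wavelet toolbox and selected a Reverse Biorthogonal wavelet, so Haar is a stand-in chosen for brevity, and the function names are hypothetical. The code assumes even image dimensions at every level.

```python
import math


def haar_step(values):
    """One 1D Haar analysis step: low-pass and high-pass outputs, decimated by 2."""
    scale = 1.0 / math.sqrt(2.0)
    low = [(values[i] + values[i + 1]) * scale for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) * scale for i in range(0, len(values), 2)]
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def filter_columns(half_image):
    """Apply haar_step down every column; return (low, high) images."""
    low_cols, high_cols = [], []
    for column in transpose(half_image):
        low, high = haar_step(column)
        low_cols.append(low)
        high_cols.append(high)
    return transpose(low_cols), transpose(high_cols)


def dwt2_haar(img):
    """One decomposition level: rows first, then columns. Returns LL, LH, HL, HH."""
    low_rows, high_rows = [], []
    for row in img:
        low, high = haar_step(row)       # horizontal filtering and decimation
        low_rows.append(low)
        high_rows.append(high)
    ll, lh = filter_columns(low_rows)    # low horizontal; low / high vertical
    hl, hh = filter_columns(high_rows)   # high horizontal; low / high vertical
    return ll, lh, hl, hh


def level2_features(chip):
    """Flattened level 2 approximation image (LLLL) used as the feature vector."""
    ll, _, _, _ = dwt2_haar(chip)
    llll, _, _, _ = dwt2_haar(ll)
    return [value for row in llll for value in row]


# Arbitrary 64 x 64 test pattern standing in for a preprocessed target chip.
chip = [[float((3 * r + 5 * c) % 17) for c in range(64)] for r in range(64)]
features = level2_features(chip)         # 16 x 16 = 256 approximation coefficients
```

With a 64 by 64 chip, each level halves both dimensions, so the level 2 approximation image is 16 by 16. For the Haar wavelet specifically, each LLLL coefficient equals the sum of the corresponding 4 by 4 pixel block divided by 4, which makes the low-pass, noise-smoothing character of the approximation sub-band easy to see.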
Figure 7: Pyramidal tree structure of the two-level 2D wavelet decomposition steps for images. Legend: Lo F = 1D low-pass filter; Hi F = 1D high-pass filter; 2↓1 = decimation by 2. The second-level outputs are LLLL, LLLH, LLHL and LLHH.
Figure 8: Two-level decomposition of the discrete wavelet transform: a) block representation of an image; b) block wavelet representation at the second resolution level, with sub-bands LLLL, LLLH, LLHL, LLHH, LH, HL and HH; c) original BMP2 image; d) wavelet representation of the BMP2 image at the second resolution level.


In the Fourier transform, images are broken down into sine and cosine functions of various frequencies, whereas in the wavelet transform images are broken down into translated and scaled versions of the mother wavelet. Many wavelet families exist, and the mother wavelets differ only in the shape and duration of the waveform. Within each family, the wavelets are classified by the number of vanishing moments (order number), which indicates the smoothness of the wavelet and the flatness of the frequency response of the wavelet filters. Not all mother wavelets have vanishing moments, however. We investigated seven mother wavelets, five of which had vanishing moments.


4.2.1  Mother  wavelet  selection  method 


A second level wavelet decomposition was applied to the training set using the Biorthogonal spline, Coiflet, Daubechies, discrete approximation of Meyer, Haar, Reverse biorthogonal, and Symlet mother wavelets, one at a time, to obtain the approximation coefficients. The standard deviation method was then applied to the approximation coefficients obtained for each wavelet in order to determine the best mother wavelet. Variations of features within the same type of target ($\mu\sigma_1$) and variations of mean features between different types of target ($\mu\sigma_2$) were measured for the selection process. A small value of $\mu\sigma_1$ indicates that the features within the same type of target are nearly invariant, and a large value of $\mu\sigma_2$ indicates that the features of different target types are far apart from each other. The mother wavelet that maximizes the distance ($\mu\sigma_2 - \mu\sigma_1$) is deemed the best wavelet.
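
The two variability measures can be sketched as follows. The sketch assumes population standard deviations (division by the number of samples or target types) and a hypothetical data layout in which each target type maps to a list of feature vectors; neither detail is specified in the text, and the toy numbers are not taken from the study.

```python
from statistics import mean, pstdev


def intra_class_spread(features_by_type):
    """Mean, over target types and features, of the per-feature standard deviation."""
    spreads = []
    for samples in features_by_type.values():
        for feature_values in zip(*samples):      # one feature across all samples
            spreads.append(pstdev(feature_values))
    return mean(spreads)


def inter_class_spread(features_by_type):
    """Mean, over features, of the standard deviation of the per-type feature means."""
    type_means = [[mean(feature_values) for feature_values in zip(*samples)]
                  for samples in features_by_type.values()]
    return mean(pstdev(feature_means) for feature_means in zip(*type_means))


# Toy data: two target types, two samples each, two features per sample.
toy = {
    "type_A": [[1.0, 10.0], [3.0, 10.0]],
    "type_B": [[7.0, 20.0], [9.0, 20.0]],
}
mu_sigma_1 = intra_class_spread(toy)   # variation within the same target type
mu_sigma_2 = inter_class_spread(toy)   # variation between target types
```

A mother wavelet whose features give a small first value and a large second value keeps samples of one target type close together while pushing different target types apart.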

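The mathematical expressions involved in the computation of $\mu\sigma_1$ are as follows:

$$ \mu\sigma_1 = \frac{1}{N \cdot M} \sum_{i=1}^{N} \sum_{j=1}^{M} \sigma_{ij} \tag{14} $$

$$ \sigma_{ij} = \sqrt{\frac{1}{L} \sum_{s=1}^{L} \left( x_s - \mu_{ij} \right)^2} \tag{15} $$

$$ \mu_{ij} = \frac{1}{L} \sum_{s=1}^{L} x_s \tag{16} $$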

In (14), $N$ is the number of target types (3 in this case), $M$ is the number of features (256 in this case), and $\sigma_{ij}$ is the standard deviation of each feature for each target type. In (15), $L$ is the number of samples for feature $j$ and target type $i$, $x_s$ is the wavelet coefficient of the $s$-th sample for the $j$-th feature and target type $i$, and $\mu_{ij}$ is the mean value for target type $i$ and feature $j$. To find the variation of features between different target types, the standard deviation ($\sigma_{\mu j}$) was calculated using each feature mean value, as shown in equation 17. The mean of the standard deviations ($\mu\sigma_2$) was then calculated using equation 19 to determine the average variation of features between different target types.


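$$ \sigma_{\mu j} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \mu_{ij} - \mu_j \right)^2} \tag{17} $$

$$ \mu_j = \frac{1}{N} \sum_{i=1}^{N} \mu_{ij} \tag{18} $$

$$ \mu\sigma_2 = \frac{1}{M} \sum_{j=1}^{M} \sigma_{\mu j} \tag{19} $$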


In (18), $\mu_j$ is the mean value of the $j$-th feature over all the target types, and the remaining variables are as defined above. The best mother wavelet was selected as the one that produced the largest variance between different target types with the same feature and the smallest variance between the same target types with the same feature. For example, assume $\mu\sigma_2$ and $\mu\sigma_1$ have been calculated for three different mother wavelets, as shown in Table 1.

Figure  9(a)  shows  three  points  plotted  as  /jcj1  versus  jjo{  for  the  three  wavelets  (MW1,  MW2 
and  MW3).  At  the  point  MW1,  values  of  /ja2  and  //cr,  are  small,  indicating  that  this  is  not  a 
good  feature  because  it  is  very  difficult  to  distinguish  between  two  different  target  types.  The 




point MW3 has larger values for both $\mu\sigma_2$ and $\mu\sigma_1$; the large spread within
each target type will cause misclassification problems. The point MW2 is a compromise between
the inter-target class variability ($\mu\sigma_2$) and the intra-target class variability
($\mu\sigma_1$). Therefore, MW2 is the best of the three mother wavelets. A minimum distance
method can be used to determine the best mother wavelet. A reference point (RP) is first
selected, defined by the minimum value of $\mu\sigma_1$ and the maximum value of $\mu\sigma_2$
among the mother wavelets tested. The distance between each point
($\mu\sigma_1$, $\mu\sigma_2$) and the RP is then computed. The mother wavelet with the minimum
distance is considered the best. In the case of MW1, MW2 and MW3 the minimum distance occurred
for MW2, thus supporting the original observation that MW2 was the best wavelet. Figure 9(b)
shows the distances computed for each mother wavelet. This distance method was followed for
selecting the best wavelet in this study.
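
As a worked illustration of the minimum distance method, the Table 1 values can be substituted
directly. The Euclidean form of the distance is an assumption here, since the report does not
state it explicitly; it is, however, the form that reproduces the distances listed in Table 7.

```latex
\begin{align*}
\mathrm{RP} &= (\min \mu\sigma_1,\ \max \mu\sigma_2) = (0.1,\ 0.9) \\
d_{\mathrm{MW1}} &= \sqrt{(0.1 - 0.1)^2 + (0.1 - 0.9)^2} = 0.800 \\
d_{\mathrm{MW2}} &= \sqrt{(0.5 - 0.1)^2 + (0.5 - 0.9)^2} \approx 0.566 \\
d_{\mathrm{MW3}} &= \sqrt{(0.9 - 0.1)^2 + (0.9 - 0.9)^2} = 0.800
\end{align*}
```

MW2 has the smallest distance to the RP, in agreement with the discussion above.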


Table 1: Example - $\mu\sigma_2$ and $\mu\sigma_1$ values for three different mother wavelets.

MOTHER WAVELET    $\mu\sigma_2$    $\mu\sigma_1$
MW1               0.1              0.1
MW2               0.5              0.5
MW3               0.9              0.9

Figure 9: Example - $\mu\sigma_2$ and $\mu\sigma_1$ plots of three different mother wavelets.
(a) Three mother wavelet points. (b) Distance between reference point and each mother wavelet
point.


The mother wavelets tested in this experiment are listed in Table 2. Features were extracted
using each mother wavelet in Table 2 in order to calculate $\mu\sigma_1$ and $\mu\sigma_2$.

Table 2: Mother wavelets tested for the selection of the best mother wavelet for classification.

#    MOTHER WAVELET                     ORDER
1    Biorthogonal spline                Orders 1.1, 3.1 and 6.8 are used in this analysis
2    Coiflets                           Orders 1, 3 and 5 are used in this analysis
3    Daubechies                         Orders 2, 24 and 45 are used in this analysis
4    Discrete approximation of Meyer    Order 3 is used in this analysis
5    Haar                               Order 1 is used in this analysis
6    Reverse biorthogonal               Orders 1.3, 3.1 and 6.8 are used in this analysis
7    Symlets                            Orders 1, 10 and 20 are used in this analysis
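
For each mother wavelet in Table 2, the two spread statistics of equations 14 and 19 could be
computed along the following lines. This is a minimal sketch in Python with NumPy, not the
Matlab code used in this study. The function name `spread_statistics`, the dictionary layout of
the input and the use of the population standard deviation (NumPy's default) are assumptions
made for illustration, and the toy data are synthetic.

```python
import numpy as np

def spread_statistics(features):
    """features maps each target type to an array of shape (L, M):
    L samples of M wavelet features. Returns (mu_sigma_1, mu_sigma_2)."""
    # Per-class feature means mu_ij and standard deviations sigma_ij, shape (N, M).
    mu = np.stack([f.mean(axis=0) for f in features.values()])
    sigma = np.stack([f.std(axis=0) for f in features.values()])
    mu_sigma_1 = sigma.mean()      # eq. 14: average spread within a target type
    sigma_mu = mu.std(axis=0)      # eq. 17: spread of the class means, per feature
    mu_sigma_2 = sigma_mu.mean()   # eq. 19: average spread between target types
    return mu_sigma_1, mu_sigma_2

# Synthetic example: three classes with means 0, 1, 2 and spread 0.1 (illustration only).
rng = np.random.default_rng(0)
toy = {name: rng.normal(loc=centre, scale=0.1, size=(50, 4))
       for name, centre in [("A", 0.0), ("B", 1.0), ("C", 2.0)]}
ms1, ms2 = spread_statistics(toy)
print(f"mu_sigma_1 = {ms1:.3f}, mu_sigma_2 = {ms2:.3f}")
```

In this toy case the classes are tight and well separated, so $\mu\sigma_1$ is small and
$\mu\sigma_2$ is large, which is the pattern sought from a good mother wavelet.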

5.0  MLP  Neural  Network  Classification 


Neural networks are characterized by two phases: the learning phase and the responding
phase. Learning can be either supervised or unsupervised. In supervised learning methods, the
inputs and desired outputs are known to the NN, and the learning process adjusts weights
(parameters) to obtain the desired outputs from the inputs. In unsupervised learning methods,
only the inputs are known to the NN, and the learning process adjusts weights to group similar
input patterns onto the same output nodes. The number of groups depends on the number of nodes
in the output layer. In this work, the supervised method is followed. During the learning
phase, the NN learns the relationship between the information given at the input and the
information requested from the output. In the responding phase, test data similar to the
learning input data is fed to the input. Based on the learned response, the NN then predicts
the response to the test data.


Figure 10: Four layer Multi Layer Perceptron Neural Network (input layer, first hidden layer,
second hidden layer, output layer).

The most commonly used nonlinear regression model is the MLP NN [42], which is capable
of learning nonlinear function mappings [41]. The MLP NN has been used more extensively
in various problems [43] than any other NN [40]. The typical MLP NN is composed of an
input layer, an output layer, and at least one hidden layer [40]. In this work, the MLP neural
network used a standard back-propagation algorithm, the delta rule algorithm [44], to
update the weights. Details of the derivation for the training and testing algorithms are not
discussed here. The MLP NN was configured as illustrated in Figure 10, with 256 nodes in the
input layer, 13 nodes in the first hidden layer, 11 nodes in the second hidden layer and 3
nodes in the output layer. The number of input nodes depends on the number of features
chosen. In this work, 256 features were chosen; therefore, 256 nodes were included in
the input layer of the NN. Because in this application the NN was used to classify three
military ground vehicles, a BMP-2, a BTR-70 and a T-72, three nodes were required in the output
layer. Node 1 was assigned to T-72, Node 2 was assigned to BTR-70, and Node 3 was
assigned to BMP-2. The number of hidden layers and the number of nodes in each hidden
layer were decided using empirical testing of the NN; these are not optimized numbers.
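
The configuration described above can be sketched as follows, in Python with NumPy rather than
the tools used in this study. Only the layer sizes (256, 13, 11, 3) come from the text. The
sigmoid activation, the squared-error loss, the weight initialisation, the learning rate and
the class name `MLP` are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MLP:
    """Hypothetical 256-13-11-3 perceptron trained by back-propagation (delta rule)."""

    def __init__(self, sizes=(256, 13, 11, 3), seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix and one bias vector per pair of adjacent layers.
        self.W = [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros(n) for n in sizes[1:]]

    def forward(self, x):
        # Returns the activations of every layer, input included.
        acts = [x]
        for W, b in zip(self.W, self.b):
            acts.append(sigmoid(acts[-1] @ W + b))
        return acts

    def train_step(self, x, target, lr=0.5):
        acts = self.forward(x)
        # Output-layer delta for a squared-error loss with sigmoid units.
        delta = (acts[-1] - target) * acts[-1] * (1.0 - acts[-1])
        for k in reversed(range(len(self.W))):
            grad_W = np.outer(acts[k], delta)
            grad_b = delta
            if k > 0:
                # Propagate the delta backwards before this layer's weights change.
                delta_prev = (self.W[k] @ delta) * acts[k] * (1.0 - acts[k])
            self.W[k] -= lr * grad_W
            self.b[k] -= lr * grad_b
            if k > 0:
                delta = delta_prev
        # Loss measured before the update.
        return 0.5 * float(np.sum((acts[-1] - target) ** 2))

# One synthetic 256-feature sample labelled as node 1 (T-72), for illustration only.
rng = np.random.default_rng(1)
x = rng.random(256)
target = np.array([1.0, 0.0, 0.0])
net = MLP()
losses = [net.train_step(x, target) for _ in range(200)]
print(losses[0], losses[-1])
```

Each output node produces a value between 0 and 1, which is what the thresholding described in
Section 6.3 operates on.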


6.0  Results  and  Discussion 


As  mentioned  in  section  2.0,  various  mother  wavelets  were  applied  to  the  public  MSTAR  data 
set  to  determine  the  best  mother  wavelet  for  this  application.  256  features  were  then  extracted 
from  each  target  chip  using  the  selected  mother  wavelet.  Features  extracted  from  the  training 
data  were  used  to  train  the  MLP  NN,  and  features  extracted  from  the  testing  data  were  used 
for testing the trained MLP NN. This section starts with a brief discussion of the experimental
data, followed by the selection of the mother wavelet and the classification results for the 2D
wavelet features.


6.1  Experimented  data  set 

The United States (US) Defense Advanced Research Projects Agency (DARPA) has made
part of the MSTAR data set available to the public. The MSTAR public data set contains
many spotlight SAR vehicle images, including 10 types of former Soviet Union vehicles, with
the azimuthal angle ranging between 0 and 360 degrees, and depression angles of 15 and 17
degrees. Three types of vehicles (BMP-2, BTR-70 and T-72) are investigated in this analysis.
Images of all three types of vehicles, taken at a 17-degree depression angle and all
azimuthal angles, were used for training the MLP NN. The target types with serial numbers,
and the number of samples used for training, are listed in Table 3. Similarly, images of all
three types of vehicles, taken at a 15-degree depression angle and all azimuthal angles, were
used for testing the 2D wavelet feature extraction and the MLP NN classifier. The target types
with serial numbers, and the number of samples used for testing, are listed in Table 4.

Table 3: Details of the MSTAR targets used as the training set.

TARGET TYPE AND SERIAL NUMBER    # OF SAMPLES
T-72 (132)                       232
BTR-70 (c72)                     233
BMP-2 (9563)                     233
Total                            698

Comments: All the targets were collected at a 17 degree depression angle, full aspect coverage
and 30 cm resolution.

Table 4: Details of MSTAR targets used as testing samples.

TARGET TYPE AND SERIAL NUMBER    # OF SAMPLES
T-72 (812)                       195
T-72 (s7)                        191
T-72 (132)                       196
BTR-70 (c72)                     196
BMP-2 (9563)                     195
BMP-2 (9566)                     196
BMP-2 (c21)                      196
Total                            1365

Comments: All the targets were collected at a 15 degree depression angle, full aspect coverage
and 30 cm resolution.

Confuser targets were used to generate Receiver Operation Characteristic (ROC) curves to
evaluate the classifier performance. Two types of confusers were used: the 2S1 self-propelled
howitzer and the D7 bulldozer. The confuser target information is listed in Table 5.


Table 5: Confusers used for ROC curve test.

TARGET TYPE AND SERIAL NUMBER    # OF SAMPLES
2S1                              274
D7                               274
Total                            548

Comments: All the confusers were collected at a 15 degree depression angle, full aspect
coverage and one foot resolution.

6.2  Selection  of  Mother  wavelet 

The Matlab wavelet toolbox was used to extract the wavelet coefficients from each image.
Different mother wavelets were tested and the values of $\mu\sigma_1$ and $\mu\sigma_2$ were
calculated for each wavelet as per equations 14 and 19. The calculated values are listed in
Table 6. These points are plotted in Figure 11 as per the method described in Section 4.2.1.
The RP was determined using the minimum value of $\mu\sigma_1$ and the maximum value of
$\mu\sigma_2$. From Table 6, the minimum value of $\mu\sigma_1$ is 0.999 and the maximum value
of $\mu\sigma_2$ is 1.3471; therefore, the RP is (0.999, 1.347), and it is denoted as RP in
Figure 11. The distance from each point in Figure 11 to the RP was calculated and is listed in
Table 7. An examination of the distances shows that the minimum distance corresponds to the
Reverse Biorthogonal Wavelet Pairs with order 3.1. Thus, this wavelet was selected for further
analysis.

Table 6: Result of feature extraction using various mother wavelets.

MOTHER WAVELET NAME                   ID     ORDER    $\mu\sigma_2$    $\mu\sigma_1$
Biorthogonal spline                   P1     1.1      0.391            1.514
                                      P2     3.1      0.587            3.458
                                      P3     6.8      0.245            1.097
Coiflets                              P4     1        0.337            1.375
                                      P5     3        0.241            1.063
                                      P6     5        0.222            1.021
Daubechies                            P7     2        0.352            1.428
                                      P8     24       0.538            2.020
                                      P9     45       1.347            5.887
Discrete approximation of Meyer       P10    N/A      0.363            1.519
Haar                                  P11    N/A      0.391            1.514
Reverse Biorthogonal Wavelet Pairs    P12    1.3      0.315            1.319
                                      P13    3.1      0.293            1.130
                                      P14    6.8      0.233            1.014
Symlets                               P15    1        0.391            1.514
                                      P16    10       0.226            1.014
                                      P17    20       0.214            0.999

Figure 11: $\mu\sigma_2$ and $\mu\sigma_1$ plots of tested mother wavelets.

Table 7: Shortest distance from each mother wavelet ($\mu\sigma_1$, $\mu\sigma_2$) point to RP.

MOTHER WAVELET NAME                   ORDER    DISTANCE FROM RP
Biorthogonal spline                   1.1      1.086
                                      3.1      2.573
                                      6.8      1.107
Coiflets                              1        1.077
                                      3        1.108
                                      5        1.125
Daubechies                            2        1.084
                                      24       1.302
                                      45       4.888
Discrete approximation of Meyer       N/A      1.113
Haar                                  N/A      1.086
Reverse Biorthogonal Wavelet Pairs    1.3      1.081
                                      3.1      1.062
                                      6.8      1.114
Symlets                               1        1.086
                                      10       1.121
                                      20       1.133
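
The selection can be checked by recomputing the distances from the Table 6 values. The sketch
below is in Python and assumes a Euclidean distance in the ($\mu\sigma_1$, $\mu\sigma_2$) plane,
which is not stated explicitly in the text; the dictionary keys are simply descriptive labels
chosen for this example.

```python
import math

# (mu_sigma_1, mu_sigma_2) pairs transcribed from Table 6.
table6 = {
    "Biorthogonal spline 1.1": (1.514, 0.391),
    "Biorthogonal spline 3.1": (3.458, 0.587),
    "Biorthogonal spline 6.8": (1.097, 0.245),
    "Coiflets 1": (1.375, 0.337),
    "Coiflets 3": (1.063, 0.241),
    "Coiflets 5": (1.021, 0.222),
    "Daubechies 2": (1.428, 0.352),
    "Daubechies 24": (2.020, 0.538),
    "Daubechies 45": (5.887, 1.347),
    "Discrete Meyer": (1.519, 0.363),
    "Haar": (1.514, 0.391),
    "Reverse biorthogonal 1.3": (1.319, 0.315),
    "Reverse biorthogonal 3.1": (1.130, 0.293),
    "Reverse biorthogonal 6.8": (1.014, 0.233),
    "Symlets 1": (1.514, 0.391),
    "Symlets 10": (1.014, 0.226),
    "Symlets 20": (0.999, 0.214),
}

# Reference point: smallest intra-class spread, largest inter-class spread.
rp = (min(p[0] for p in table6.values()), max(p[1] for p in table6.values()))

# Euclidean distance of each wavelet's point from the reference point.
distances = {name: math.dist(point, rp) for name, point in table6.items()}
best = min(distances, key=distances.get)

for name, d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{name:28s} {d:.3f}")
print("Selected:", best)
```

Under this assumption the smallest distance is obtained for the Reverse Biorthogonal wavelet of
order 3.1, and the recomputed values agree with Table 7 to within rounding.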

6.3  Classification  results  and  discussion 

Section  6.2  showed  that,  of  the  wavelets  tested,  the  Reverse  Biorthogonal  Wavelet  had  the 
best  performance.  Therefore,  this  wavelet  was  used  to  extract  features  from  the  target  images. 
The  two-hidden-layer  MLP  NN  classifier  was  then  used  to  classify  the  targets.  First  the 
wavelet  feature  extraction  algorithm  was  applied  to  the  training  data  set  (Table  3)  and  then  the 
MLP  NN  was  trained  on  the  extracted  features  so  it  could  learn  to  distinguish  between  target 
types.  After  training,  the  performance  of  the  MLP  NN  was  evaluated  using  the  testing  set 
(Table  4)  and  confuser  set  (Table  5).  The  evaluation  was  conducted  using  the  confusion 
matrix  and  the  ROC  curves.  A  complete  explanation  of  this  evaluation  technique  can  be 
found  in  [45],  which  is  a  NATO  working  paper.  In  the  ROC  curve,  the  y-axis  represents  the 
number  of  targets  correctly  detected  divided  by  the  total  number  of  same  type  of  targets 
presented  to  the  classifier  (as  a  percentage),  and  the  x-axis  represents  the  number  of  confusers 
detected  divided  by  the  total  number  of  confusers  presented  to  the  classifier  (as  a  percentage). 
These  points  are  calculated  by  changing  the  output  threshold  value  of  the  MLP  NN  classifier 
from  0  to  1  in  increments  of  0.01. 

There  are  three  output  nodes  in  the  MLP  NN’s  output  layer,  and  each  node  represents  a 
particular  target  type  (Node  1  for  T-72,  Node  2  for  BTR-70  and  Node  3  for  BMP-2). 

Assigned  to  each  node  is  a  particular  threshold  value.  When  a  target  is  tested,  each  of  the 




three output nodes of the NN produces a value between 0 and 1. The output value of each
node is then compared to the threshold value of that node. A target is considered similar to a
trained target type if the output value is greater than or equal to the threshold value, and
dissimilar to a trained target type if the output value is less than the threshold value. In the
event that two output nodes accept a target, the node with the greatest difference between its
output value and its threshold value wins.
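
The decision rule described above can be sketched as follows. This is an illustrative Python
version, not the code used in this study; the function name `classify` and the label "other"
for a rejected target are assumptions, and the output values are made up.

```python
def classify(outputs, thresholds, labels=("T-72", "BTR-70", "BMP-2")):
    """Return the winning label, or "other" if no output node accepts the target."""
    # A node accepts the target when its output reaches its own threshold.
    margins = [out - thr for out, thr in zip(outputs, thresholds)]
    accepted = [i for i, margin in enumerate(margins) if margin >= 0]
    if not accepted:
        return "other"
    # If several nodes accept, the largest margin above threshold wins.
    return labels[max(accepted, key=lambda i: margins[i])]

print(classify([0.90, 0.20, 0.95], [0.5, 0.5, 0.9]))  # nodes 1 and 3 accept; node 1 wins
print(classify([0.10, 0.20, 0.30], [0.5, 0.5, 0.5]))  # no node accepts
print(classify([0.10, 0.20, 0.30], [0.0, 0.0, 0.0]))  # zero thresholds always give a class
```

With all thresholds set to zero, every target is assigned to one of the three trained classes.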

One of the experiments was conducted by setting the threshold value for each output node to
zero. This ensures that the classifier assigns every target to one of the class types, because
output values are always between 0 and 1 inclusive. The decision indicates which of the trained
target types each tested target most resembles. Table 8 shows the confusion matrix.
The asterisks in the table indicate the vehicle types, with serial numbers, used for training.
The percentage of correct classification (Pcc) is higher for the vehicles (same serial
number) used in training than for the other targets. For each target, the correct
recognition rate is shown in bold. Overall, the Pcc is 84.2%. According to the results in the
table, the confusers D7 and 2S1 are more similar to the T72 vehicle type than to the other two.


Table 8: Confusion Matrix for Pd of 1.

TARGET TYPE AND SERIAL NUMBER    BMP2    BTR70    T72    Pcc (%)
BMP2 (SN_9563)                   144     7        44     73.85
BMP2 (SN_9566)                   146     6        44     74.49
BMP2 (SN_C21)*                   185     0        11     94.39
BTR70 (SN_C71)*                  7       185      4      94.39
T72 (SN_132)*                    7       1        188    95.92
T72 (SN_S7)                      36      8        147    76.96
T72 (SN_812)                     39      2        154    78.97
2S1                              69      60       145    -
D7                               21      1        252    -
Overall Pcc                                              84.18
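
The Pcc figures in Table 8 follow directly from the counts. As a hedged illustration in Python
(the variable names are chosen for this example only), the per-target and overall values can be
recomputed as the correctly classified count divided by the number of samples presented:

```python
COLS = ("BMP2", "BTR70", "T72")

# (row label, true class, counts assigned to BMP2, BTR70, T72), transcribed from Table 8.
table8 = [
    ("BMP2 (SN_9563)", "BMP2", (144, 7, 44)),
    ("BMP2 (SN_9566)", "BMP2", (146, 6, 44)),
    ("BMP2 (SN_C21)", "BMP2", (185, 0, 11)),
    ("BTR70 (SN_C71)", "BTR70", (7, 185, 4)),
    ("T72 (SN_132)", "T72", (7, 1, 188)),
    ("T72 (SN_S7)", "T72", (36, 8, 147)),
    ("T72 (SN_812)", "T72", (39, 2, 154)),
]

# Per-target Pcc: correct count over the samples of that row.
per_row = {name: 100 * counts[COLS.index(cls)] / sum(counts)
           for name, cls, counts in table8}

# Overall Pcc: all correct counts over all target samples.
correct = sum(counts[COLS.index(cls)] for _, cls, counts in table8)
total = sum(sum(counts) for _, _, counts in table8)
overall = 100 * correct / total

for name, value in per_row.items():
    print(f"{name:16s} {value:.2f}")
print(f"Overall Pcc      {overall:.2f}")
```

The confuser rows are left out because they have no correct class among the three target types.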

The percentage false alarm rate (Pf) was calculated using the confuser targets in Table 5
and the percentage correct detection rate (Pd) was calculated using the testing targets in
Table 4. The Pf and Pd rates were computed for different output threshold values and then
plotted as Pd vs. Pf, as shown in Figure 12. In this plot, the lower left and the upper right
corner points were generated at output threshold values of 1 and 0 respectively, and the rest
of the points were generated at threshold values between 0 and 1. This graph is called a ROC
curve. The Pf and the Pd were calculated separately for each target type. For a random
classifier, the values of the Pd and Pf are approximately equal for different output threshold
values (the area under the curve of a random classifier is approximately 0.5). The broken line
shown in Figure 12 is the random classifier line. Any curve below this line is not good
because the Pd is less than the Pf. In this study all the classifiers produced ROC curves above




the  random  classifier  line.  The  performance  of  the  classifier  is  best  when  the  tested  targets 
belong  to  the  same  class  of  targets  used  in  training. 
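
The threshold sweep used to build a ROC curve can be sketched as follows. This is a simplified
Python illustration for a single output node: a target or confuser counts as detected when the
node output reaches the threshold. The function name `roc_points` is hypothetical and the
output values are made up for the example.

```python
def roc_points(target_scores, confuser_scores, steps=100):
    """Return (threshold, Pd, Pf) triples for thresholds 0, 0.01, ..., 1."""
    points = []
    for k in range(steps + 1):
        threshold = k / steps
        # Fraction of targets and of confusers accepted at this threshold.
        p_d = sum(s >= threshold for s in target_scores) / len(target_scores)
        p_f = sum(s >= threshold for s in confuser_scores) / len(confuser_scores)
        points.append((threshold, p_d, p_f))
    return points

# Made-up node outputs, for illustration only.
target_outputs = [0.95, 0.80, 0.40]
confuser_outputs = [0.70, 0.20, 0.10]
curve = roc_points(target_outputs, confuser_outputs)

print(curve[0])    # threshold 0: upper right corner of the ROC plot
print(curve[50])   # threshold 0.5
print(curve[-1])   # threshold 1: lower left corner of the ROC plot
```

Plotting Pd against Pf over all thresholds gives the ROC curve; a classifier is useful only where
the curve lies above the diagonal Pd = Pf.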


Figure 12: ROC curve for the BMP2, BTR70 and T72 target types.


The threshold values at the output nodes were adjusted to ensure a Pd of 0.9. Typically a Pd of
0.9 is used in the MSTAR evaluation method as a standard operating point [25,46]. Table 9
shows the confusion matrix that was produced for this Pd. The ratio of the total number of
correct classifications to the total number of targets detected (Pcc/d) is highest when the
serial number of the test target is the same as the serial number of the trained target (BMP2
(SN_C21), BTR70 (SN_C71), and T72 (SN_132)). To compare the wavelet features with 2D
FFT features, classification based on the 2D FFT features was performed. The results for a Pd
of 0.9 are shown in Table 10. The overall Pcc/d is slightly higher for the wavelet-based
features. The false alarm rate is lower for the wavelet-based features (Pf is 0.85 at Pd of 0.9)
than for the 2D FFT-based features (Pf is 0.89 at Pd of 0.9).


Table 9: Confusion Matrix at Pd of 0.9.

TARGET TYPE AND SERIAL NUMBER    BMP2    BTR70    T72    OTHERS
BMP2 (SN_9563)                   110     4        35     46
BMP2 (SN_9566)                   122     2        39     33
BMP2 (SN_C21)*                   178     0        10     8
BTR70 (SN_C71)*                  1       178      3      14
T72 (SN_132)*                    5       0        187    4
T72 (SN_S7)                      29      7        140    15
T72 (SN_812)                     31      1        147    16
2S1                              31      47       131    65
D7                               10      1        244    19

Overall Pcc/d = 86.41%

Table 10: MLP Confusion matrix for 2D FFT features at Pd of 0.9 ([25]).

TARGET TYPE AND SERIAL NUMBER    BMP2    BTR70    T72    OTHERS
T-72 (812)                       141     13       15     26
T-72 (S7)                        122     13       30     26
T-72 (132)                       177     3        5      11
BTR-70 (C72)                     3       183      0      10
BMP-2 (9563)                     5       5        170    15
BMP-2 (9566)                     31      16       121    28
BMP-2 (C21)                      23      4        152    17
2S1                              45      93       87     49
D7                               22      2        238    12

Overall Pcc/d = 85.20%
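
The rates quoted for the wavelet features can be reproduced from the Table 9 counts. The Python
sketch below assumes that a target or confuser counts as detected when it is assigned to any of
the three target classes rather than to "Others"; this reading is an interpretation, but it
matches the quoted Pd of 0.9, Pf of 0.85 and overall Pcc/d of 86.41%.

```python
COLS = ("BMP2", "BTR70", "T72", "Others")

# (true class, counts in columns BMP2, BTR70, T72, Others), transcribed from Table 9.
targets = [
    ("BMP2", (110, 4, 35, 46)),
    ("BMP2", (122, 2, 39, 33)),
    ("BMP2", (178, 0, 10, 8)),
    ("BTR70", (1, 178, 3, 14)),
    ("T72", (5, 0, 187, 4)),
    ("T72", (29, 7, 140, 15)),
    ("T72", (31, 1, 147, 16)),
]
confusers = [(31, 47, 131, 65), (10, 1, 244, 19)]  # 2S1 and D7

total_targets = sum(sum(counts) for _, counts in targets)
detected = sum(sum(counts[:3]) for _, counts in targets)          # not sent to "Others"
correct = sum(counts[COLS.index(cls)] for cls, counts in targets)
total_confusers = sum(sum(counts) for counts in confusers)
false_alarms = sum(sum(counts[:3]) for counts in confusers)       # confusers accepted

p_d = detected / total_targets
p_f = false_alarms / total_confusers
p_ccd = correct / detected

print(f"Pd = {p_d:.3f}, Pf = {p_f:.3f}, Pcc/d = {100 * p_ccd:.2f}%")
```

The same calculation applies to the other confusion matrices that include an "Others" column.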

An ideal classifier’s ROC curve should go through the point (0,1), which represents a zero
false alarm rate and a 100% detection rate. Therefore, the threshold whose ROC point lies
closest to (0,1) gives the best available trade-off between a high detection rate and a low
false alarm rate. The best threshold values found were 0.9906 for node 1 (T-72), 0.9937 for
node 2 (BTR-70), and 0.9805 for node 3 (BMP-2). The confusion matrix produced with these values
is listed in Table 11. The Pd is 62.3% and the Pf is 41.2%. The overall Pcc/d is 93.2%.


Table 11: Confusion Matrix at lower false alarm and maximum detection rate.

TARGET TYPE AND SERIAL NUMBER    BMP2    BTR70    T72    OTHERS
BMP2 (SN_9563)                   76      0        9      110
BMP2 (SN_9566)                   84      0        10     102
BMP2 (SN_C21)*                   151     0        5      40
BTR70 (SN_C71)*                  1       134      2      59
T72 (SN_132)*                    1       0        167    28
T72 (SN_S7)                      15      2        90     84
T72 (SN_812)                     13      0        90     92
2S1                              5       15       74     180
D7                               3       1        128    142

Overall Pcc/d = 93.18%

Although the Pf has been reduced, it is still quite high. To further reduce the Pf, slicy
confusers (shown in Figure 13) were included in the training set and the MLP NN was
retrained. The slicy confuser is a man-made object, and this ATR system was not intended to
detect or classify it. The retrained MLP NN was evaluated using the testing set data from
Table 4 and the confusers from Table 5. A ROC curve and the confusion matrix for a
Pd of 0.9 were generated. The ROC curve is shown in Figure 14 and the confusion matrix
is shown in Table 12.

Figure 14: ROC curve for the BMP2, BTR70 and T72 target types - Slicy confusers are
included in the training set.


Table 12: Confusion Matrix for Pd of 0.9 when Slicy confusers were included in the
training set.

TARGET TYPE AND SERIAL NUMBER    BMP2    BTR70    T72    OTHERS
BMP2 (SN_9563)                   124     5        35     31
BMP2 (SN_9566)                   122     5        50     19
BMP2 (SN_C21)*                   172     0        13     11
BTR70 (SN_C71)*                  1       181      3      11
T72 (SN_132)*                    3       0        186    7
T72 (SN_S7)                      27      6        134    24
T72 (SN_812)                     23      0        139    33
2S1                              42      68       116    48
D7                               15      1        219    39

Overall Pcc/d = 86.09%

The ROC curve in Figure 14 shows no improvement over that in Figure 12, and the overall
Pcc/d from Table 12 (86.1%) is actually lower than that shown in Table 9 (86.41%).

The reason for this result is explained as follows. Assume ‘x’ represents BMP-2, ‘o’
represents T-72 and a third symbol represents BTR-70, as shown in Figure 15. Before including
confusers in the training set, the trained MLP NN covers a small area for each type of target
(Area 1 for target type BMP-2, Area 2 for target type T-72, and Area 3 for BTR-70), as shown
in Figure 15a. After the inclusion of confusers in the training set, the trained MLP NN covers
a different area, as shown in Figure 15b. Consider a target tested under both cases. The tested
target belongs to the BMP-2 type and is marked in blue in Figure 15. The tested target
is located inside Area 1 in Figure 15a, and the MLP NN classifier will recognize it
correctly. In Figure 15b, the tested target is located outside Area 1, and the classifier
will recognize it as an other or unknown target type. Consequently, the overall Pcc/d will be
reduced when confusers are included in the training set.


Figure 15: Example of training cover area before and after inclusion of confusers in training
set. (a) Before inclusion of confusers. (b) After inclusion of confusers.


Including a different type of confuser (Slicy) in the training set reduces the false alarm rate
only slightly. Therefore, confusers of the same types as those in the test set, imaged at a 17°
depression angle, were added to the training set to further reduce the false alarm rate. The
test set confusers were imaged at a 15° depression angle. Again the MLP NN was trained and then
evaluated using the ROC curve and confusion matrix at a Pd of 0.9. The ROC curve is shown in
Figure 16 and the confusion matrix is shown in Table 13. This ROC curve is clearly better
than the previous ROC curves. That is, these curves pass closer to the ideal ROC curve than
the previous ROC curves. The confusion matrix shows that the overall Pcc/d is 84.7%, which
is less than the previous Pcc/d values. However, the Pf value is reduced to 5.7%. This
shows that if confusers of a similar type are added to the training set, then the false alarm
rate will be reduced.

Table 13: Confusion Matrix at Pd of 0.9 - 2S1 and D7 confusers are included in the
training set.

TARGET TYPE AND SERIAL NUMBER    BMP2    BTR70    T72    OTHERS
BMP2 (SN_9563)                   124     9        47     15
BMP2 (SN_9566)                   116     7        45     28
BMP2 (SN_C21)*                   186     0        8      2
BTR70 (SN_C71)*                  1       188      4      3
T72 (SN_132)*                    4       0        192    0
T72 (SN_S7)                      19      4        121    47
T72 (SN_812)                     36      4        114    41
2S1                              3       7        14     250
D7                               3       1        3      267

Overall Pcc/d = 84.70%

Figure 16: ROC curve for the BMP2, BTR70 and T72 target types - 2S1 and D7 confusers
are included in the training set.

7.0 Conclusion


In this study, a comparison of seven different mother wavelets was conducted using the 3-class
MSTAR SAR dataset. The Reverse Biorthogonal Wavelet Pairs mother wavelet performed
better than all the other wavelets tested, based on the combination of minimum
variation within the same class type and maximum variation between different class types.
Compared to the 2D FFT features, second level wavelet decomposition features showed only
marginal improvement in classification performance. The MLP NN classifier using the Reverse
Biorthogonal Wavelet Pairs features has a better detection rate and a lower false alarm rate
than the random classifier.

The MLP NN classifier was trained with three different combinations of datasets. The first
dataset contained all the training samples (Table 3), the second contained all the training
samples and the slicy confuser, and the third contained all the training samples and confusers
of the same types as the tested confusers but imaged at a different depression angle. The
correct classification rate did not vary noticeably across the three datasets, but compared to
the first dataset the false alarm rate was greatly reduced with the third dataset and slightly
reduced with the second. This experiment shows that adding the expected confusers to the
training set will reduce the false alarm rate.

Future work in this area may include the fusion of features produced by the FFT and WT feature
extractors. Fusing these algorithms will extract a large number of features from each target,
which may give high variation between targets of the same class and low variation between
targets of different classes. Therefore, only a small number of dominant features from both
feature extraction methods should be used. In this study, only the second level approximation
coefficients were considered and the detail coefficients were ignored. Therefore, WT features
should be investigated in more detail to extract meaningful features for the SAR ATR
application.